High-fidelity facial avatar reconstruction from a monocular video is a significant research problem in computer graphics and computer vision. Recently, Neural Radiance Fields (NeRF) have shown impressive novel view rendering results and have been considered for facial avatar reconstruction. However, the complex facial dynamics and missing 3D information in monocular videos raise significant challenges for faithful facial reconstruction. In this work, we propose a new method for NeRF-based facial avatar reconstruction that utilizes a 3D-aware generative prior. Unlike existing works that depend on a conditional deformation field for dynamic modeling, we propose to learn a personalized generative prior, which is formulated as a local, low-dimensional subspace in the latent space of a 3D-GAN. We propose an efficient method to construct the personalized generative prior from a small set of facial images of a given individual. Once learned, the prior allows photo-realistic rendering from novel views, and face reenactment can be realized by navigating the latent space. Our method is applicable to different driving signals, including RGB images, 3DMM coefficients, and audio. Compared with existing works, we obtain superior novel view synthesis results and faithful face reenactment performance.
translated by Google Translate
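We study how to represent a video with implicit neural representations (INRs). Classical INR methods generally use an MLP to map input coordinates to output pixels, while some recent works try to reconstruct the whole image directly with a CNN. However, we argue that neither the pixel-wise nor the image-wise strategy is well suited to video data. Instead, we propose a patch-wise solution, PS-NeRV, which represents a video as a function of patches and their corresponding patch coordinates. It naturally inherits the advantages of image-wise methods and achieves excellent reconstruction performance with fast decoding speed. The whole method consists of conventional modules, such as positional embedding, MLPs, and CNNs, and also introduces AdaIN to enhance intermediate features. These simple yet essential changes help the network fit high-frequency details easily. Extensive experiments demonstrate its effectiveness in several video-related tasks, such as video compression and video inpainting.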
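The idea of indexing a video by patch coordinates and feeding those coordinates through a positional embedding can be illustrated with a minimal Python sketch. It assumes a sinusoidal embedding of the kind commonly used in coordinate-based networks; the `(t, row, col)` layout, the frequency base of 1.25, the number of frequencies, and all function names are hypothetical choices for illustration, not details taken from the paper.

```python
import math


def positional_embedding(x, num_freqs=4, base=1.25):
    """Sinusoidal embedding of one scalar coordinate x in [0, 1].

    base and num_freqs are illustrative assumptions.
    """
    out = []
    for k in range(num_freqs):
        angle = (base ** k) * math.pi * x
        out.extend([math.sin(angle), math.cos(angle)])
    return out


def patch_coordinates(num_frames, patches_per_side):
    """Return normalized (t, row, col) coordinates for every patch of every frame."""
    t_den = max(num_frames - 1, 1)
    p_den = max(patches_per_side - 1, 1)
    coords = []
    for t in range(num_frames):
        for r in range(patches_per_side):
            for c in range(patches_per_side):
                coords.append((t / t_den, r / p_den, c / p_den))
    return coords


def embed_patch(coord, num_freqs=4):
    """Concatenate the embeddings of the three patch coordinates."""
    emb = []
    for x in coord:
        emb.extend(positional_embedding(x, num_freqs))
    return emb


# 3 frames, each split into a 2 x 2 grid of patches.
coords = patch_coordinates(num_frames=3, patches_per_side=2)
features = [embed_patch(c) for c in coords]
```

In a full model, each embedded coordinate vector would be passed to the MLP and CNN stages that decode one patch; those stages are omitted here.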
Recently, model-based agents have achieved better performance than model-free ones under the same computational budget and training time in single-agent environments. However, the complexity of multi-agent systems makes it hard to learn a model of the environment, and the resulting compounding error may hinder learning when model-based methods are applied to multi-agent tasks. This paper proposes an implicit model-based multi-agent reinforcement learning method built on value decomposition. With this method, agents can interact with the learned virtual environment and evaluate the current state value according to imagined future states in the latent space, which gives them foresight. Our approach can be applied to any multi-agent value decomposition method. Experimental results show that our method improves sample efficiency in different partially observable Markov decision process domains.
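Cooperation between agents in multi-agent systems (MAS) has become a hot topic in recent years, and many algorithms based on centralized training with decentralized execution (CTDE), such as VDN and QMIX, have been proposed. However, these methods ignore the information hidden in the individual action values. In this paper, we propose Hypergraph Convolution Mixing (HGCN-MIX), a method that combines hypergraph convolution with value decomposition. By treating action values as signals, HGCN-MIX aims to explore the relationships between these signals via a self-learned hypergraph. Experimental results show that HGCN-MIX matches or surpasses state-of-the-art techniques on the StarCraft II Multi-Agent Challenge (SMAC) benchmark in various scenarios, notably those with a large number of agents.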
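The two ingredients named above, value decomposition and hypergraph convolution, can be illustrated with a minimal Python sketch. It assumes the simplest VDN-style mixer, which sums per-agent action values, and a fixed, hand-written incidence matrix with a parameter-free convolution. HGCN-MIX itself learns the hypergraph, and its exact layer is not given in this abstract, so the function names and the normalization used here are assumptions.

```python
def vdn_mix(agent_qs):
    """VDN-style mixing: the joint value is the sum of per-agent action values."""
    return sum(agent_qs)


def hypergraph_conv(agent_qs, incidence):
    """One parameter-free hypergraph convolution step (an illustrative assumption).

    incidence[i][e] is 1 if agent i belongs to hyperedge e, else 0.
    Every agent must belong to at least one hyperedge, and every
    hyperedge must contain at least one agent.
    """
    n = len(agent_qs)
    m = len(incidence[0])
    edge_deg = [sum(incidence[i][e] for i in range(n)) for e in range(m)]
    node_deg = [sum(incidence[i]) for i in range(n)]
    # Step 1: each hyperedge averages the values of its member agents.
    edge_feat = [
        sum(incidence[i][e] * agent_qs[i] for i in range(n)) / edge_deg[e]
        for e in range(m)
    ]
    # Step 2: each agent averages the features of the hyperedges it belongs to.
    return [
        sum(incidence[i][e] * edge_feat[e] for e in range(m)) / node_deg[i]
        for i in range(n)
    ]


# Three agents; hyperedge 0 = {agent 0, agent 1}, hyperedge 1 = {agent 1, agent 2}.
agent_qs = [1.0, 2.0, 3.0]
incidence = [[1, 0], [1, 1], [0, 1]]

smoothed_qs = hypergraph_conv(agent_qs, incidence)
q_total = vdn_mix(smoothed_qs)
```

The convolution lets each agent's value be informed by the agents that share a hyperedge with it before the values are mixed into a joint value.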
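Exemplar-based colorization methods rely on a reference image to provide plausible colors for a target grayscale image. The key difficulty of exemplar-based colorization is establishing an accurate correspondence between the two images. Previous approaches have attempted to construct such a correspondence but face two obstacles. First, using the luminance channel to compute the correspondence is inaccurate. Second, the dense correspondence they build introduces wrong matches and increases the computational burden. To address these two problems, we propose the Semantic-Sparse Colorization Network (SSCN), which transfers both the global image style and detailed, semantically related colors to the grayscale image in a coarse-to-fine manner. Our network balances global and local colors well while alleviating the ambiguous matching problem. Experiments show that our method outperforms existing methods in both quantitative and qualitative evaluation and achieves state-of-the-art performance.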
Recently, some challenging tasks in multi-agent systems have been solved by hierarchical reinforcement learning methods. Inspired by the intra-level and inter-level coordination in the human nervous system, we propose HAVEN, a novel value decomposition framework based on hierarchical reinforcement learning for fully cooperative multi-agent problems. To address the instability arising from the concurrent optimization of policies across levels and agents, we introduce a dual coordination mechanism of inter-level and inter-agent strategies by designing reward functions in a two-level hierarchy. HAVEN requires neither domain knowledge nor pre-training, and can be applied to any value decomposition variant. Our method achieves desirable results on different decentralized partially observable Markov decision process domains and outperforms other popular multi-agent hierarchical reinforcement learning algorithms.
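The framework of deep reinforcement learning (DRL) provides a powerful and widely applicable mathematical formalization for sequential decision-making. This paper presents a novel DRL framework, termed \emph{$f$-Divergence Reinforcement Learning (FRL)}. In FRL, the policy evaluation and policy improvement phases are performed simultaneously by minimizing the $f$-divergence between the learning policy and the sampling policy, which differs from conventional DRL algorithms that aim to maximize the expected cumulative reward. We prove theoretically that minimizing this $f$-divergence makes the learning policy converge to the optimal policy. In addition, we convert the process of training agents in the FRL framework into a saddle-point optimization problem with a specific $f$ function through the Fenchel conjugate, which yields new methods for policy evaluation and policy improvement. Through mathematical proofs and empirical evaluation, we demonstrate that the FRL framework has two advantages: (1) the policy evaluation and policy improvement processes are performed simultaneously, and (2) the problem of overestimating the value function is naturally alleviated. To evaluate the effectiveness of the FRL framework, we conduct experiments on Atari 2600 video games and show that agents trained in the FRL framework match or surpass the baseline DRL algorithms.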
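For reference, the standard definitions behind the terms used above can be written out as follows. This is the textbook form for two generic distributions $P$ and $Q$ with densities $p$ and $q$; how the paper instantiates $P$, $Q$, and the function class $\mathcal{T}$ with the learning and sampling policies, and which $f$ it chooses, is not stated in this abstract.

```latex
\begin{align}
  % f-divergence: f is convex with f(1) = 0
  D_f(P \,\|\, Q) &= \mathbb{E}_{x \sim Q}\!\left[ f\!\left( \frac{p(x)}{q(x)} \right) \right], \\
  % Fenchel (convex) conjugate of f
  f^{*}(t) &= \sup_{u \in \operatorname{dom} f} \bigl\{\, u\,t - f(u) \,\bigr\}, \\
  % Variational lower bound obtained from the conjugate
  D_f(P \,\|\, Q) &\ge \sup_{T \in \mathcal{T}}
    \Bigl( \mathbb{E}_{x \sim P}\bigl[ T(x) \bigr]
         - \mathbb{E}_{x \sim Q}\bigl[ f^{*}\bigl( T(x) \bigr) \bigr] \Bigr).
\end{align}
```

Minimizing the left-hand side over one distribution while maximizing the bound over $T$ gives a min-max problem, which is the general sense in which a Fenchel conjugate turns divergence minimization into a saddle-point optimization.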
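As one of the solutions to decentralized partially observable Markov decision process (DEC-POMDP) problems, value decomposition methods have recently achieved remarkable results. However, most value decomposition methods require the fully observable state of the environment during training, which is infeasible in scenarios where only incomplete and noisy observations can be obtained. We therefore propose a novel value decomposition framework, named State Inference for Value Decomposition (SIDE), which eliminates the need to know the global state by simultaneously solving the two problems of optimal control and state inference. SIDE can be extended to any value decomposition method to tackle partially observable problems. By comparing the performance of different algorithms on StarCraft II micromanagement tasks, we verify that, even without an accessible state, SIDE can infer from past local observations a current state that benefits the reinforcement learning process, and can even achieve superior results to many baselines in some complex scenarios.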
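Spatiotemporal predictive learning (ST-PL) is a hot topic with many applications, such as object movement and meteorological prediction. It aims to predict subsequent frames from observed sequences. However, the inherent uncertainty among consecutive frames makes long-term prediction more difficult. To tackle the increasing ambiguity during forecasting, we design CMS-LSTM, which focuses on context correlations and multi-scale spatiotemporal flow with fine-grained local details. It contains two carefully designed blocks: the Context Embedding (CE) block and the Spatiotemporal Expression (SE) block. CE is designed for rich context interactions, while SE focuses on multi-scale spatiotemporal expression in hidden states. The newly introduced blocks also help other spatiotemporal models (e.g., PredRNN, SA-ConvLSTM) produce representative implicit features for ST-PL and improve prediction quality. Qualitative and quantitative experiments demonstrate the effectiveness and flexibility of our proposed method. With fewer parameters, CMS-LSTM outperforms state-of-the-art methods on a number of metrics on two representative benchmarks and scenarios.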
Dynamic Graph Neural Networks (DGNNs) have been broadly applied in various real-life applications, such as link prediction and pandemic forecasting, to capture both static structural information and temporal characteristics from dynamic graphs. Combining both time-dependent and time-independent components, DGNNs offer substantial potential for parallel computation and data reuse, but suffer from severe memory access inefficiency and data transfer overhead under the canonical one-graph-at-a-time training pattern. To tackle these challenges, we propose PiPAD, a $\underline{\textbf{Pi}}pelined$ and $\underline{\textbf{PA}}rallel$ $\underline{\textbf{D}}GNN$ training framework for end-to-end performance optimization on GPUs. At both the algorithm and runtime levels, PiPAD holistically reconstructs the overall training paradigm, from data organization to the manner of computation. Capable of processing multiple graph snapshots in parallel, PiPAD eliminates unnecessary data transmission and alleviates memory access inefficiency to improve overall performance. Our evaluation across various datasets shows that PiPAD achieves a $1.22\times$-$9.57\times$ speedup over state-of-the-art DGNN frameworks on three representative models.